Overview
The process_earnings_performance.py script analyzes stock price performance following quarterly earnings announcements. It uses company filings data to identify earnings release dates and calculates returns since the announcement, accounting for market timing (pre-market vs. post-market releases).
Purpose
This script adds earnings-related performance tracking by:
- Extracting latest quarterly results announcement date from regulatory filings
- Implementing smart benchmarking that handles pre-market and post-market announcements
- Calculating returns from the earnings date to the current price
- Calculating maximum returns achieved since the earnings announcement
Inputs Required
all_stocks_fundamental_analysis.json
Master analysis file; serves as both input and output.
Company filings directory
Contains JSON files with regulatory filings for each company, named {SYMBOL}_filings.json.
OHLCV directory
Contains OHLCV CSV files for each stock, used to calculate price performance.
Company Filings JSON Structure
{
    "data": [
        {
            "descriptor": "Financial Results",
            "news_date": "2026-01-27 20:17:25",
            "caption": "Outcome of Board Meeting",
            "file_url": "https://..."
        }
    ]
}
Output Produced
all_stocks_fundamental_analysis.json
Updates the master analysis file by adding earnings performance fields.
Processing Logic
1. Earnings Date Extraction
Identifies the latest earnings announcement from the filings:
def get_earnings_info(filing_path):
    """Extract latest results date and time"""
    try:
        with open(filing_path, "r") as f:
            data = json.load(f)
        filings = data.get("data", [])
        # Filter for "Financial Results" filings
        results = [f for f in filings if f.get("descriptor") == "Financial Results"]
        if not results:
            return None, None
        # Sort by date, latest first
        results.sort(key=lambda x: x.get("news_date", ""), reverse=True)
        return results[0].get("news_date", ""), results[0].get("descriptor")
    except Exception:
        return None, None
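The sort key relies on timestamps in "YYYY-MM-DD HH:MM:SS" format ordering lexicographically, so a plain string sort picks the latest filing. A quick check with made-up filing entries:

```python
# Made-up filing entries; only the two "Financial Results" rows survive the filter.
filings = [
    {"descriptor": "Financial Results", "news_date": "2025-10-18 17:05:00"},
    {"descriptor": "Press Release", "news_date": "2026-01-27 20:17:25"},
    {"descriptor": "Financial Results", "news_date": "2026-01-27 20:17:25"},
]
results = [f for f in filings if f.get("descriptor") == "Financial Results"]
# Lexicographic descending sort == chronological descending for this format
results.sort(key=lambda x: x.get("news_date", ""), reverse=True)
latest = results[0]["news_date"]
print(latest)  # 2026-01-27 20:17:25
```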
2. Smart Benchmarking Logic
Handles pre-market vs. post-market announcements differently:
def calculate_earnings_metrics(csv_path, earnings_news_date):
    # Parse news date and time
    date_part = earnings_news_date.split(" ")[0]
    time_part = earnings_news_date.split(" ")[1] if " " in earnings_news_date else "00:00:00"
    target_date = pd.to_datetime(date_part)
    hour = int(time_part.split(":")[0])
    minute = int(time_part.split(":")[1])

    # Load OHLCV data
    df = pd.read_csv(csv_path)
    df['Date'] = pd.to_datetime(df['Date'])
    latest_price = df.iloc[-1]['Close']

    # Determine if the news hit after market hours
    # (the Indian market closes at 15:30 IST)
    is_after_hours = (hour > 15) or (hour == 15 and minute >= 30)
3. Benchmark Price Selection
Selects appropriate base price based on announcement timing:
    if is_after_hours:
        # Post-market announcement: benchmark is the close of the announcement day
        pre_news_df = df[df['Date'] <= target_date]
        post_news_df = df[df['Date'] > target_date]
    else:
        # Pre-market/during-market: benchmark is the close BEFORE the announcement
        pre_news_df = df[df['Date'] < target_date]
        post_news_df = df[df['Date'] >= target_date]

    if pre_news_df.empty or post_news_df.empty:
        # Fallback handling
        if post_news_df.empty:
            return 0.0, 0.0
        base_price = post_news_df.iloc[0]['Close']
    else:
        base_price = pre_news_df.iloc[-1]['Close']
4. Returns Calculation
Calculates both current and maximum returns:
    # 1. Returns since earnings (%)
    returns_since = ((latest_price - base_price) / base_price) * 100

    # 2. Max returns since earnings (%)
    max_high = post_news_df['High'].max()
    max_returns = ((max_high - base_price) / base_price) * 100

    return round(returns_since, 2), round(max_returns, 2)
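A quick sanity check of the two formulas with made-up prices (base 100, latest close 112.5, post-announcement high 118.3):

```python
# Illustrative prices only, not from any real stock
base_price = 100.0
latest_price = 112.5
max_high = 118.3

# Same formulas as the script
returns_since = ((latest_price - base_price) / base_price) * 100
max_returns = ((max_high - base_price) / base_price) * 100

print(round(returns_since, 2), round(max_returns, 2))  # 12.5 18.3
```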
5. Master Data Update
Updates all stocks with earnings metrics:
def main():
    # Load master data
    with open(MASTER_JSON, "r") as f:
        analysis_data = json.load(f)

    for stock in analysis_data:
        symbol = stock.get("Symbol")
        filing_file = os.path.join(FILINGS_DIR, f"{symbol}_filings.json")
        ohlcv_file = os.path.join(OHLCV_DIR, f"{symbol}.csv")

        # Get earnings info
        earnings_news_date, _ = get_earnings_info(filing_file)
        stock["Quarterly Results Date"] = earnings_news_date.split(" ")[0] if earnings_news_date else "N/A"

        # Calculate metrics
        ret, max_ret = calculate_earnings_metrics(ohlcv_file, earnings_news_date)
        stock["Returns since Earnings(%)"] = ret
        stock["Max Returns since Earnings(%)"] = max_ret

    # Save updates
    with open(MASTER_JSON, "w") as f:
        json.dump(analysis_data, f, indent=4)
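Because the loop touches one CSV per stock, a single missing or malformed file can abort the whole run. A minimal hardening sketch (the `safe_metrics` wrapper is illustrative, not part of the script; the fallback matches the script's 0.0/0.0 convention):

```python
import os

def safe_metrics(calc, csv_path, earnings_news_date):
    # calc is the metrics function (e.g. calculate_earnings_metrics).
    # A missing file or a per-stock failure falls back to 0.0/0.0.
    if not os.path.exists(csv_path):
        return 0.0, 0.0
    try:
        return calc(csv_path, earnings_news_date)
    except (KeyError, IndexError, ValueError):
        return 0.0, 0.0

# With a missing file, the fallback is returned without calling calc:
print(safe_metrics(lambda p, d: (1.0, 2.0), "no_such_file.csv", None))  # (0.0, 0.0)
```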
Fields Added/Modified
This script adds the following fields to each stock record:
Earnings Timing
- Quarterly Results Date: Date of latest earnings announcement (YYYY-MM-DD format)
- Returns since Earnings(%): Percentage return from earnings announcement to current price
- Max Returns since Earnings(%): Maximum percentage return achieved since earnings announcement
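After a run, each stock record in the master JSON carries all three fields. An illustrative record (values taken from the RELIANCE example below, not real data):

```python
# Illustrative stock record after processing
stock = {
    "Symbol": "RELIANCE",
    "Quarterly Results Date": "2026-01-27",
    "Returns since Earnings(%)": 12.5,
    "Max Returns since Earnings(%)": 18.3,
}
# The date is stored without the time component, per main()
print(stock["Quarterly Results Date"])  # 2026-01-27
```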
Use Cases
1. Earnings Reaction Analysis
Track immediate and sustained market reaction to quarterly results:
Stock: RELIANCE
Quarterly Results Date: 2026-01-27
Returns since Earnings: +12.5%
Max Returns since Earnings: +18.3%
2. Post-Earnings Drift Identification
Identify stocks showing continued momentum after earnings:
- If current returns ≈ max returns → sustained momentum
- If current returns < max returns → pullback from peak
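The two bullets above can be expressed as a small classifier; the `classify_drift` helper and its 1% tolerance are illustrative choices, not part of the script:

```python
def classify_drift(returns_since, max_returns, tolerance=1.0):
    """Label post-earnings behaviour (illustrative helper).
    tolerance is the gap (in percentage points) still counted as 'at the peak'."""
    if max_returns - returns_since <= tolerance:
        return "sustained momentum"
    return "pullback from peak"

print(classify_drift(12.5, 18.3))  # pullback from peak
print(classify_drift(12.5, 12.9))  # sustained momentum
```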
3. Earnings Calendar Integration
Provides context for recent price movements by showing proximity to earnings events.
Smart Benchmarking Examples
Example 1: Post-Market Announcement
Earnings Date: 2026-01-27 20:17:25 (8:17 PM)
Market Close: 15:30 (3:30 PM)
Status: After Hours ✓
Benchmark: Close of 2026-01-27
First Reaction Day: 2026-01-28
Example 2: Pre-Market Announcement
Earnings Date: 2026-01-27 08:30:00 (8:30 AM)
Market Open: 09:15 (9:15 AM)
Status: Pre-Market ✓
Benchmark: Close of 2026-01-26 (day before)
First Reaction Day: 2026-01-27
Example 3: During-Market Announcement
Earnings Date: 2026-01-27 14:00:00 (2:00 PM)
Market Hours: 09:15 - 15:30
Status: During Market ✓
Benchmark: Close of 2026-01-26 (day before)
First Reaction: Intraday 2026-01-27
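The three scenarios above reduce to the script's single after-hours test; a standalone sketch of that check applied to each example timestamp:

```python
def is_after_hours(news_date):
    # Market close assumed at 15:30 IST, matching the script
    time_part = news_date.split(" ")[1] if " " in news_date else "00:00:00"
    hour, minute = int(time_part.split(":")[0]), int(time_part.split(":")[1])
    return (hour > 15) or (hour == 15 and minute >= 30)

print(is_after_hours("2026-01-27 20:17:25"))  # True  -> benchmark: close of 2026-01-27
print(is_after_hours("2026-01-27 08:30:00"))  # False -> benchmark: close of 2026-01-26
print(is_after_hours("2026-01-27 14:00:00"))  # False -> benchmark: close of 2026-01-26
```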
Code Example
process_earnings_performance.py
import json
import pandas as pd

def get_earnings_info(filing_path):
    """Extract the latest 'Financial Results' date from a filings JSON file."""
    try:
        with open(filing_path, "r") as f:
            data = json.load(f)
        filings = data.get("data", [])
        results = [f for f in filings if f.get("descriptor") == "Financial Results"]
        if not results:
            return None, None
        results.sort(key=lambda x: x.get("news_date", ""), reverse=True)
        return results[0].get("news_date", ""), results[0].get("descriptor")
    except Exception:
        return None, None

def calculate_earnings_metrics(csv_path, earnings_news_date):
    if not earnings_news_date:
        return 0.0, 0.0

    # Parse timing (guard against a missing time component)
    date_part = earnings_news_date.split(" ")[0]
    time_part = earnings_news_date.split(" ")[1] if " " in earnings_news_date else "00:00:00"
    hour = int(time_part.split(":")[0])
    minute = int(time_part.split(":")[1])

    # Determine the benchmark: the market closes at 15:30 IST
    is_after_hours = (hour > 15) or (hour == 15 and minute >= 30)

    df = pd.read_csv(csv_path)
    df['Date'] = pd.to_datetime(df['Date'])
    target_date = pd.to_datetime(date_part)

    if is_after_hours:
        pre_news_df = df[df['Date'] <= target_date]
        post_news_df = df[df['Date'] > target_date]
    else:
        pre_news_df = df[df['Date'] < target_date]
        post_news_df = df[df['Date'] >= target_date]

    # Fallback when the data does not span the announcement date
    if post_news_df.empty:
        return 0.0, 0.0
    if pre_news_df.empty:
        base_price = post_news_df.iloc[0]['Close']
    else:
        base_price = pre_news_df.iloc[-1]['Close']

    latest_price = df.iloc[-1]['Close']
    max_high = post_news_df['High'].max()
    returns_since = ((latest_price - base_price) / base_price) * 100
    max_returns = ((max_high - base_price) / base_price) * 100
    return round(returns_since, 2), round(max_returns, 2)
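A quick end-to-end check of the after-hours branch with a synthetic OHLCV file (all prices made up). The announcement falls on 2026-01-27 after market close, so the benchmark is the close of 2026-01-27 itself:

```python
import io
import pandas as pd

# Synthetic OHLCV data spanning the announcement date
csv = io.StringIO(
    "Date,Open,High,Low,Close\n"
    "2026-01-26,98,99,97,98\n"
    "2026-01-27,99,101,98,100\n"
    "2026-01-28,104,110,103,108\n"
    "2026-01-29,108,119,107,112.5\n"
)
df = pd.read_csv(csv)
df["Date"] = pd.to_datetime(df["Date"])

target_date = pd.to_datetime("2026-01-27")
pre = df[df["Date"] <= target_date]   # after-hours branch: announcement day included
post = df[df["Date"] > target_date]

base = pre.iloc[-1]["Close"]          # 100.0, close of the announcement day
latest = df.iloc[-1]["Close"]         # 112.5
max_high = post["High"].max()         # 119.0

print(round((latest - base) / base * 100, 2))    # 12.5
print(round((max_high - base) / base * 100, 2))  # 19.0
```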
Function Reference
get_earnings_info(filing_path)
Extracts latest earnings announcement date from company filings.
Parameters:
filing_path: Path to the company’s filings JSON file
Returns: Tuple of (news_date, descriptor) or (None, None) if not found
calculate_earnings_metrics(csv_path, earnings_news_date)
Calculates performance metrics relative to earnings announcement.
Parameters:
csv_path: Path to stock’s OHLCV CSV file
earnings_news_date: Earnings announcement timestamp (format: “YYYY-MM-DD HH:MM:SS”)
Returns: Tuple of (returns_since, max_returns) in percentage terms
main()
Processes all stocks and updates master JSON with earnings metrics.
Returns: None (writes output to JSON file)
Performance Notes
- Processing Speed: ~2,000 stocks processed in 5-10 seconds
- Sequential Processing: no parallelization (could be optimized)
- Error Handling: gracefully handles missing filings or OHLCV data
- Date Parsing: handles "YYYY-MM-DD HH:MM:SS" timestamps and falls back to midnight when the time component is missing
Dependencies
- json: JSON file handling
- os: file path operations
- glob: file pattern matching
- pandas: DataFrame operations and date handling
- datetime: date parsing and manipulation
Important Notes
- Dependency Chain: Must run after advanced_metrics_processor.py
- Filing Source: Relies on “Financial Results” descriptor in filings
- Market Hours: Assumes Indian market hours (9:15 AM - 3:30 PM)
- Time Zone: All timestamps should be in IST
- Fallback Logic: Returns 0.0 for both metrics if data unavailable
Source File Location
process_earnings_performance.py:1-110